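To deploy legged robots safely in the real world, they must be able to reliably detect unexpected contacts and accurately estimate the corresponding contact forces. In this paper, we propose a collision detection and identification pipeline for quadrupedal robots. We first introduce an approach that estimates the collision time span based on band-pass filtering, and show that this information is key to obtaining accurate collision force estimates. We then improve the accuracy of the identified force magnitude by compensating for model inaccuracies, unmodeled loads, and any other potential source of quasi-static disturbances acting on the robot. We validate our framework with extensive hardware experiments in various scenarios, including trotting and additional unmodeled loads on the robot.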
translated by 谷歌翻译
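Model predictive control (MPC) schemes have proven their efficiency in controlling complex robotic systems with many degrees of freedom (DoF). However, they are computationally expensive, with update rates of about tens of hertz. This relatively slow update rate hinders stable haptic teleoperation of such systems, since slow feedback loops can cause instability and a loss of transparency for the operator. This work presents a novel framework for transparent teleoperation of MPC-controlled complex robotic systems. In particular, we employ a feedback MPC approach and exploit its structure to incorporate the operator's input at a fast rate that is independent of the update rate of the MPC loop itself. We demonstrate our framework on a mobile manipulator platform and show that it significantly improves the transparency and stability of haptic teleoperation. We also highlight that the proposed feedback structure satisfies, and does not violate, the constraints defined in the optimal control problem. To the best of our knowledge, this work is the first realization of bilateral teleoperation of a manipulator using a whole-body MPC framework.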
Multimodal deep learning has been used to predict clinical endpoints and diagnoses from clinical routine data. However, these models suffer from scaling issues: they have to learn pairwise interactions between each piece of information in each data type, thereby escalating model complexity beyond manageable scales. This has so far precluded a widespread use of multimodal deep learning. Here, we present a new technical approach of "learnable synergies", in which the model only selects relevant interactions between data modalities and keeps an "internal memory" of relevant data. Our approach is easily scalable and naturally adapts to multimodal data inputs from clinical routine. We demonstrate this approach on three large multimodal datasets from radiology and ophthalmology and show that it outperforms state-of-the-art models in clinically relevant diagnosis tasks. Our new approach is transferable and will allow the application of multimodal deep learning to a broad set of clinically relevant problems.
This work proposes a framework that generalizes Critical Heat Flux (CHF) detection classification models using an Unsupervised Image-to-Image (UI2I) translation model. The framework enables a typical classification model that was trained and tested on boiling images from domain A to make predictions on boiling images from a domain B that the classification model has never seen. This is done by using the UI2I model to transform the domain B images so that they look like the domain A images the classification model is familiar with. Although a CNN was used as the classification model and Fixed-Point GAN (FP-GAN) as the UI2I model, the framework is model agnostic: it can generalize any type of image classification model, which makes it applicable to a variety of similar applications and not limited to the boiling crisis detection problem. It also means that as UI2I models advance, the performance of the framework improves.
The success of Deep Learning applications critically depends on the quality and scale of the underlying training data. Generative adversarial networks (GANs) can generate arbitrarily large datasets, but their diversity and fidelity are limited. This limitation has recently been addressed by denoising diffusion probabilistic models (DDPMs), whose superiority has been demonstrated on natural images. In this study, we propose Medfusion, a conditional latent DDPM for medical images. We compare our DDPM-based model against GAN-based models, which constitute the current state of the art in the medical domain. Medfusion was trained and compared with (i) StyleGan-3 on n=101,442 images from the AIROGS challenge dataset to generate fundoscopies with and without glaucoma, (ii) ProGAN on n=191,027 images from the CheXpert dataset to generate radiographs with and without cardiomegaly, and (iii) wGAN on n=19,557 images from the CRCMS dataset to generate histopathological images with and without microsatellite stability. In the AIROGS, CRCMS, and CheXpert datasets, Medfusion achieved a lower (=better) FID than the GANs (11.63 versus 20.43, 30.03 versus 49.26, and 17.28 versus 84.31). Fidelity (precision) and diversity (recall) were also higher (=better) for Medfusion in all three datasets. Our study shows that DDPMs are a superior alternative to GANs for image synthesis in the medical domain.
Recent advances in computer vision have shown promising results in image generation. Diffusion probabilistic models in particular have generated realistic images from textual input, as demonstrated by DALL-E 2, Imagen and Stable Diffusion. However, their use in medicine, where image data typically comprises three-dimensional volumes, has not been systematically evaluated. Synthetic images may play a crucial role in privacy-preserving artificial intelligence and can also be used to augment small datasets. Here we show that diffusion probabilistic models can synthesize high-quality medical imaging data, specifically Magnetic Resonance Images (MRI) and Computed Tomography (CT) images. We provide quantitative measurements of their performance through a reader study with two medical experts who rated the quality of the synthesized images in three categories: realistic image appearance, anatomical correctness, and consistency between slices. Furthermore, we demonstrate that synthetic images can be used in self-supervised pre-training and improve the performance of breast segmentation models when data is scarce (Dice score 0.91 vs. 0.95 without vs. with synthetic data).
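For readers unfamiliar with the Dice score quoted for the segmentation models, the following is a minimal sketch of the standard definition, 2|A∩B| / (|A| + |B|), for two binary masks. The function name `dice_score`, the flat-list mask representation, and the convention for two empty masks are assumptions made for illustration and are not taken from the paper.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as flat sequences of 0/1."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of elements")
    # Number of positions where both masks are 1.
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical toy masks, flattened to one dimension.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
print(dice_score(pred, target))
```

A score of 1.0 means the predicted and reference masks coincide exactly, and 0.0 means they do not overlap at all.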
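The endeavor of artificial intelligence (AI) is to design autonomous agents capable of achieving complex tasks. To that end, reinforcement learning (RL) provides a theoretical framework for learning optimal behaviors. In practice, RL algorithms rely on geometric discounting to evaluate this optimality. Unfortunately, this does not cover decision processes in which future returns are not exponentially less valuable. Depending on the problem, this limitation causes sample inefficiency (because feedback decays exponentially) and requires additional curricula or exploration mechanisms (to deal with sparse, deceptive, or adversarial rewards). In this paper, we address these issues by reformulating the discounted problem using delayed objective functions. We investigate the underlying RL problem to derive: 1) the optimal stationary solution and 2) an approximation of the optimal non-stationary control. The resulting algorithms solve hard exploration problems in tabular environments and improve sample efficiency on classic simulated robotics benchmarks.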
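To make the limitation of geometric discounting concrete, here is a minimal sketch of the standard discounted return, the sum of gamma**t * r_t over an episode. It only illustrates the baseline criterion that the abstract criticizes, not the delayed objective proposed by the authors. The function name `discounted_return` and the toy episode are assumptions for illustration.

```python
def discounted_return(rewards, gamma):
    """Geometrically discounted return: sum over t of gamma**t * rewards[t]."""
    total = 0.0
    # Accumulate from the last step backwards (Horner's scheme).
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical sparse-reward episode: a single reward of 1 at the final step.
rewards = [0.0] * 49 + [1.0]
for gamma in (0.9, 0.99):
    # The only reward is weighted by gamma**49, so it is heavily attenuated
    # for small gamma by the time it reaches the first state.
    print(gamma, discounted_return(rewards, gamma))
```

With a sparse reward that arrives late, the value seen from the first state shrinks exponentially with the delay, which is the feedback decay the abstract refers to.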
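Despite the superior performance of CNNs, deploying them on devices with low computational power remains limited because they are typically computationally expensive. One key cause of this high complexity is the connection between the convolutional layers and the fully connected layers, which typically requires a large number of parameters. To alleviate this issue, Bag of Features (BoF) pooling has recently been proposed. BoF learns a dictionary that is used to compile a histogram representation of the input. In this paper, we propose an approach that builds on BoF pooling and boosts its efficiency by ensuring that the items of the learned dictionary are non-redundant. We propose an additional loss term, based on the pairwise correlation of the dictionary items, which complements the standard loss and explicitly regularizes the model to learn a more diverse and richer dictionary. The proposed strategy yields an efficient variant of BoF and further boosts its performance without any additional parameters.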
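The idea of penalizing pairwise correlation between dictionary items can be sketched as follows. The abstract does not give the exact form of the loss, so the choice of the mean squared Pearson correlation over all pairs, the names `redundancy_penalty` and `total_loss`, and the `weight` parameter are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(u, v):
    """Pearson correlation between two equal-length vectors."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    denom = sqrt(var_u * var_v)
    # A constant vector has no defined correlation; treat it as uncorrelated.
    return 0.0 if denom == 0 else cov / denom


def redundancy_penalty(dictionary):
    """Mean squared pairwise correlation between dictionary items (assumed form)."""
    pairs = list(combinations(dictionary, 2))
    if not pairs:
        return 0.0
    return sum(pearson(u, v) ** 2 for u, v in pairs) / len(pairs)


def total_loss(task_loss, dictionary, weight=0.1):
    """Standard loss plus the weighted diversity regularizer."""
    return task_loss + weight * redundancy_penalty(dictionary)


redundant = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]   # second item is a scaled copy
diverse = [[1.0, 2.0, 3.0], [1.0, -2.0, 1.0]]    # uncorrelated items
print(redundancy_penalty(redundant), redundancy_penalty(diverse))
```

Because the penalty is computed from the dictionary that BoF already learns, a term of this kind adds no trainable parameters, which is consistent with the claim in the abstract.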
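In recent years, WiFi has become the primary source of information for locating a person or device indoors. Collecting RSSI values as reference measurements at known positions, known as WiFi fingerprinting, is commonly used in the positioning methods and algorithms that appear in the literature. However, measuring the spatial distance between a given set of WiFi fingerprints is strongly affected by the choice of the signal distance function used to model geospatial distance. In this study, the authors propose using machine learning to improve the estimation of the geospatial distance between fingerprints. The study examines data collected from 13 different open datasets to provide a broad representation, with the aim of a general model that can be used in any indoor environment. The proposed approach extracts data features by examining a set of commonly used signal distance metrics through a feature selection process that includes feature analysis and a genetic algorithm. To demonstrate that the output of the study is independent of the data it was trained on, all models were tested on datasets that had been excluded from the training and validation phases. Finally, various machine learning algorithms were compared using a wide range of evaluation metrics, including the ability to scale the test bed out to real-world unsolicited datasets.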
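As an illustration of what a signal distance function between two fingerprints looks like, the sketch below computes Euclidean and Manhattan distances in signal space. The abstract does not list which metrics the study examined, so these two are assumed examples. The dictionary representation (access point identifier to RSSI in dBm), the `MISSING_RSSI` placeholder of -100 dBm, and the access point names are likewise assumptions.

```python
from math import sqrt

# Assumed placeholder (in dBm) for access points not heard in a fingerprint.
MISSING_RSSI = -100.0


def aligned(fp_a, fp_b):
    """Pair up RSSI values over the union of access points seen in either fingerprint."""
    access_points = sorted(set(fp_a) | set(fp_b))
    return [
        (fp_a.get(ap, MISSING_RSSI), fp_b.get(ap, MISSING_RSSI))
        for ap in access_points
    ]


def euclidean(fp_a, fp_b):
    """Euclidean distance between two fingerprints in signal space."""
    return sqrt(sum((a - b) ** 2 for a, b in aligned(fp_a, fp_b)))


def manhattan(fp_a, fp_b):
    """Manhattan (L1) distance between two fingerprints in signal space."""
    return sum(abs(a - b) for a, b in aligned(fp_a, fp_b))


# Hypothetical fingerprints with partially overlapping access points.
fp1 = {"ap1": -40, "ap2": -70}
fp2 = {"ap1": -43, "ap3": -96}
print(euclidean(fp1, fp2), manhattan(fp1, fp2))
```

Different metrics rank the same pairs of fingerprints differently, which is why the choice of signal distance function affects how well signal distance tracks geospatial distance.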
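One of the most important problems arising in time series analysis is bifurcation, or change point, detection. That is, given a collection of time series over a varying parameter, when has the structure of the underlying dynamical system changed? For this task, we turn to the field of topological data analysis (TDA), which encodes information about the shape and structure of data. The idea of using tools from TDA for signal processing tasks, known as topological signal processing (TSP), has gained much attention in recent years, largely through a standard pipeline that computes the persistent homology of the point cloud generated by the Takens embedding. However, this procedure is limited by computation time, since the simplicial complex generated in this case is large and also contains a great deal of redundant data. We therefore turn to a more recent method for encoding the structure of the attractor, which constructs an ordinal partition network (OPN) representing information about when the dynamical system passes between regions of state space. The result is a weighted graph whose structure encodes information about the underlying attractor. Our previous work began to look for ways to package the information in the OPN in a form amenable to TDA. However, that work used only the network structure and did nothing to encode the additional weighting information. In this paper, we take the next step: we build a pipeline to analyze the weighted OPN with TDA and show that this framework is more resilient to noise or perturbations in the system and improves the accuracy of dynamic state detection.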
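The construction of a weighted ordinal partition network can be sketched as follows: each delay-embedded window of the series is mapped to the permutation that sorts it, and the edge weights count transitions between consecutive permutations. This is a minimal sketch under stated assumptions, not the authors' pipeline. The function names, the default `dim` and `delay` values, the tie-breaking rule, and the choice to drop self-loops are all assumptions.

```python
from collections import Counter


def ordinal_pattern(window):
    """Permutation of indices that sorts the window (ties broken by position)."""
    return tuple(sorted(range(len(window)), key=lambda i: window[i]))


def ordinal_partition_network(series, dim=3, delay=1):
    """Weighted directed edges between consecutive ordinal patterns.

    Returns a Counter mapping (source_pattern, target_pattern) to the
    number of times that transition occurs in the series.
    """
    span = (dim - 1) * delay
    patterns = [
        ordinal_pattern(series[i : i + span + 1 : delay])
        for i in range(len(series) - span)
    ]
    edges = Counter()
    for source, target in zip(patterns, patterns[1:]):
        # Assumed convention: transitions that stay in the same region
        # (self-loops) are not recorded.
        if source != target:
            edges[(source, target)] += 1
    return edges


# Hypothetical periodic signal: the network reduces to a single weighted cycle.
series = [0, 1, 2, 1, 0, 1, 2, 1, 0]
for (source, target), weight in ordinal_partition_network(series).items():
    print(source, "->", target, "weight", weight)
```

The edge weights are the additional information that the abstract says earlier work ignored. A TDA step would then operate on this weighted graph, for example through a distance derived from the weights.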